Performance analysis and Optimisation of the Met Unified Model on a Cray XC30
Abstract
The Unified Model (UM) code supports simulation of weather, climate and earth system processes. It is primarily developed by the UK Met Office, but in recent years a wider community of users and developers has grown around the code. Here we present results from the optimisation work carried out by the UK National Centre for Atmospheric Science (NCAS) for a high resolution configuration (N512 ≈ 25 km) on the UK ARCHER supercomputer, a Cray XC30. On ARCHER, we use Cray Performance Analysis Tools (CrayPAT) to analyse the performance of the UM and then Cray Reveal to identify and parallelise serial loops using OpenMP directives. We compare performance of the optimised version at a range of scales, and with a range of optimisations, including altered MPI rank placement and the addition of OpenMP directives. Improvements in MPI configuration yield performance improvements of between 5 and 12%, and the added OpenMP directives yield an additional 5-16% speedup. We also identify further code optimisations which could yield yet greater improvement in performance. We note that the speedup gained by adding OpenMP directives does not result in improved performance on the IBM Power platform, where much of the code has been developed. This suggests that performance gains on future heterogeneous architectures will be hard to port. Nonetheless, it is clear that the investment of months in analysis and optimisation has yielded performance gains that correspond to the saving of tens of millions of core-hours on current climate projects.
Similar resources
CP2K Performance from Cray XT3 to XC30
CP2K is a powerful open-source program for atomistic simulation using a range of methods including Classical potentials, Density Functional Theory based on the Gaussian and Plane Waves approach, and post-DFT methods. CP2K has been designed and optimised for large parallel HPC systems, including a mixed-mode MPI/OpenMP parallelisation, as well as CUDA kernels for particular types of calculations...
Estimating the Performance Impact of the MCDRAM on KNL Using Dual-Socket Ivy Bridge Nodes on Cray XC30
NERSC is preparing for its next petascale system, named Cori, a Cray XC system based on the Intel KNL MIC architecture. Each Cori node will have 72 cores (288 threads), 512 bit vector units, and a low capacity (16GB) and high bandwidth (~5x DDR4) on-package memory (MCDRAM or HBM). To help applications get ready for Cori, NERSC has developed optimization strategies that focus on the MPI+OpenMP p...
Algebraic Multigrid on a Dragonfly Network: First Experiences on a Cray XC30
The Cray XC30 represents the first appearance of the dragonfly interconnect topology in a product from a major HPC vendor. The question of how well applications perform on such a machine naturally arises. We consider the performance of an algebraic multigrid solver on an XC30 and develop a performance model for its solve cycle. We use this model to both analyze its performance and guide data re...
Optimising Hydrodynamics applications for the Cray XC30 with the application tool suite
Power constraints are forcing HPC systems to continue to increase hardware concurrency. Efficiently scaling applications on future machines will be essential for improved science and it is recognised that the “flat” MPI model will start to reach its scalability limits. The optimal approach is unknown, necessitating the use of mini-applications to rapidly evaluate new approaches. Reducing MPI ta...
Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q
In this paper, we examine the performance of a suite of applications on three different architectures: Edison, a Cray XC30 with Intel Ivy Bridge processors; Hopper and Cielo, both Cray XE6’s with AMD Magny–Cours processors; and Mira, an IBM BlueGene/Q with PowerPC A2 processors. The applications chosen are a subset of the applications used in a joint procurement effort between Lawrence Berkeley...
Journal: CoRR
Volume: abs/1511.03885
Publication date: 2015